A Robust Finite-state Parser for French
Abstract
This paper describes a robust finite-state parser implemented for French. The parser attaches morpho-syntactic tags to each word and determines clause boundaries. It is a reductionist parser based on finite-state networks and their intersection. We describe the essential elements of the rule-writing system and show how it is applied to handle various phenomena, such as argument uniqueness, agreement, or apposition. Results indicate that the parser can parse technical manuals with high accuracy: in a test sample, 95% of part-of-speech and functional tags were correct. The average number of parses per sentence is very low: more than 92% of sentences produce fewer than four parses, including the correct one. A test on very long sentences from newspaper corpora and a discussion of errors provide more insight into the parser.
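The idea of reductionist parsing by intersection can be illustrated with a minimal sketch. It is not the parser described in the abstract: the lexicon, the tag set, and the two constraints below are hypothetical toy examples chosen for illustration. The sentence is encoded as an automaton accepting every tag sequence the lexicon allows, each constraint is a small automaton over tags, and intersecting them (product construction) removes the readings that violate a constraint.

```python
class DFA:
    """Deterministic automaton; a missing transition means rejection."""
    def __init__(self, start, accepting, delta):
        self.start = start
        self.accepting = set(accepting)
        self.delta = delta  # dict: (state, symbol) -> next state

def out_edges(dfa):
    """Group transitions by source state."""
    edges = {}
    for (state, sym), target in dfa.delta.items():
        edges.setdefault(state, []).append((sym, target))
    return edges

def intersect(a, b):
    """Product construction: accepts exactly the strings a and b both accept."""
    edges_a = out_edges(a)
    start = (a.start, b.start)
    delta, accepting = {}, set()
    seen, stack = {start}, [start]
    while stack:
        p, q = stack.pop()
        if p in a.accepting and q in b.accepting:
            accepting.add((p, q))
        for sym, p2 in edges_a.get(p, []):
            q2 = b.delta.get((q, sym))
            if q2 is None:
                continue  # b rejects this symbol here: the reading is removed
            delta[((p, q), sym)] = (p2, q2)
            if (p2, q2) not in seen:
                seen.add((p2, q2))
                stack.append((p2, q2))
    return DFA(start, accepting, delta)

def sentence_automaton(words, lexicon):
    """Linear automaton accepting every tag sequence the lexicon allows."""
    delta = {(i, tag): i + 1
             for i, word in enumerate(words) for tag in lexicon[word]}
    return DFA(0, {len(words)}, delta)

def accepted_strings(dfa):
    """Enumerate accepted strings (the automaton must be acyclic)."""
    edges, results = out_edges(dfa), []
    def walk(state, path):
        if state in dfa.accepting:
            results.append(tuple(path))
        for sym, target in edges.get(state, []):
            walk(target, path + [sym])
    walk(dfa.start, [])
    return sorted(results)

# Hypothetical toy tag set, lexicon, and constraints (not from the paper).
TAGS = ["ADJ", "DET", "NOUN", "PRON", "VERB"]
LEXICON = {"la": {"DET", "PRON"},
           "porte": {"NOUN", "VERB"},
           "ferme": {"ADJ", "NOUN", "VERB"}}

def no_det_before_verb():
    """Reject any sequence where DET is immediately followed by VERB."""
    delta = {}
    for tag in TAGS:
        delta[("ok", tag)] = "after_det" if tag == "DET" else "ok"
        if tag != "VERB":
            delta[("after_det", tag)] = "after_det" if tag == "DET" else "ok"
    return DFA("ok", {"ok", "after_det"}, delta)

def exactly_one_verb():
    """Accept only sequences containing exactly one VERB."""
    delta = {}
    for tag in TAGS:
        if tag == "VERB":
            delta[(0, tag)] = 1
        else:
            delta[(0, tag)] = 0
            delta[(1, tag)] = 1
    return DFA(0, {1}, delta)

def parse(words, lexicon, constraints):
    """Intersect the sentence automaton with each constraint in turn."""
    net = sentence_automaton(words, lexicon)
    for constraint in constraints:
        net = intersect(net, constraint)
    return accepted_strings(net)

readings = parse(["la", "porte", "ferme"], LEXICON,
                 [no_det_before_verb(), exactly_one_verb()])
```

In this toy setup the lexicon licenses 12 tag sequences for "la porte ferme", and the two constraints leave 4 of them. Because intersection is commutative, the constraints can be applied in any order with the same result.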
Similar resources
Rules and Constraints in a French Finite-State Grammar
This report describes the rule system of a robust finite-state parser implemented for French. The parser attaches syntactic tags to each word as well as part-of-speech and morphological tags, and determines clause boundaries. It is a reductionist parser, i.e., it removes readings from the originally ambiguous text. The underlying parser is based on finite-state networks and their intersection. We des...
A Language-Independent Shallow-Parser Compiler
We present a rule-based shallow-parser compiler that makes it possible to generate a robust shallow parser for any language, even in the absence of training data, by resorting to a very limited number of rules which aim at identifying constituent boundaries. We contrast our approach with other approaches used for shallow parsing (i.e., finite-state and probabilistic methods). We present an evaluation of o...
PROFER: predictive, robust finite-state parsing for spoken language
The natural language processing component of a speech understanding system is commonly a robust, semantic parser, implemented as either a chart-based transition network or a generalized left-right (GLR) parser. In contrast, we are developing a robust, semantic parser that is a single, predictive finite-state machine. Our approach is motivated by our belief that such a finite-state parser can ult...
Comparative Study of GLR Parser with Finite-state Predictors and Chart-based Semantic Parsers
The natural language processing component of a speech understanding system is commonly a robust, semantic parser, implemented as either a chart-based transition network or a generalized left-right (GLR) parser. In contrast, we are developing a robust, semantic parser that is a single, predictive finite-state machine. Our approach is motivated by our belief that such a finite-state parser ca...
A Robust Parser for Unrestricted Greek Text
In this paper we describe a method for the efficient parsing of real-life Greek texts at the surface syntactic level. A grammar consisting of non-recursive regular expressions describing Greek phrase structure has been compiled into a cascade of finite-state transducers used to recognize syntactic constituents. The implemented parser lends itself to applications where large-scale text processin...